Developing a discrimination rule between breast cancer patients and controls using proteomics mass spectrometric data: a three-step approach.
Abstract
To discriminate between breast cancer patients and controls, we used a three-step approach to obtain our decision rule. First, we ranked the mass/charge values using random forests, because they generate importance indices that take possible interactions into account. We observed that the top-ranked variables consisted of highly correlated contiguous mass/charge values, which were grouped in the second step into new variables. Finally, these newly created variables were used as predictors to find a suitable discrimination rule. In this last step, we compared three methods: classification and regression trees (CART), logistic regression, and penalized logistic regression. Logistic regression and penalized logistic regression performed equally well, and both had a higher classification accuracy than CART. The model obtained with penalized logistic regression was chosen because we hypothesized that it would classify the validation set more accurately. The solution performed well on the training set, with a classification accuracy of 86.3% and a sensitivity and specificity of 86.8% and 85.7%, respectively.
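The second step (grouping highly correlated contiguous mass/charge values into new variables) can be illustrated with a minimal sketch. The abstract does not say how the grouping was done, so everything here is an assumption made for illustration: the function names `pearson`, `group_contiguous` and `summarize_groups`, the correlation threshold of 0.9, and the choice of the per-sample mean intensity as the new variable. The input `selected` stands for the indices of the top-ranked mass/charge values from the first step, which is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / sqrt(sxx * syy)


def group_contiguous(columns, selected, threshold=0.9):
    """Group selected column indices that are adjacent and highly correlated.

    columns:   list of intensity columns, ordered by mass/charge value;
               each column holds one intensity per sample.
    selected:  indices of the top-ranked columns (hypothetical step-1 output).
    threshold: assumed minimum correlation between neighbours in a group.
    """
    groups = []
    for idx in sorted(selected):
        if groups:
            last = groups[-1][-1]
            adjacent = idx == last + 1
            if adjacent and pearson(columns[last], columns[idx]) >= threshold:
                groups[-1].append(idx)
                continue
        groups.append([idx])
    return groups


def summarize_groups(columns, groups):
    """Build one new variable per group: the mean intensity per sample (assumed)."""
    n_samples = len(columns[0])
    return [
        [sum(columns[j][i] for j in group) / len(group) for i in range(n_samples)]
        for group in groups
    ]
```

The resulting variables would then serve as predictors for the classifiers compared in the third step. A different summary (for example the sum or the maximum intensity per group) or a different threshold would fit the description in the abstract equally well.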
Similar resources
DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION
Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...
The effect of reproductive factors on breast cancer risk: a case-control study
Background & Objectives: Breast cancer is a common malignancy in women in many parts of the world. The incidence of breast cancer in Iranian women is growing. Iranian patients are relatively younger than their western counterparts. We conducted a case-control study to determine roles of reproductive factors for breast cancer among women in Iran. Methods: A hospital based case-control study was ...
Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer.
BACKGROUND Surface-enhanced laser desorption/ionization (SELDI) is an affinity-based mass spectrometric method in which proteins of interest are selectively adsorbed to a chemically modified surface on a biochip, whereas impurities are removed by washing with buffer. This technology allows sensitive and high-throughput protein profiling of complex biological specimens. METHODS We screened for...
The Association of the MTHFR Gene Polymorphisms with Breast Cancer Susceptibility
Introduction: Breast cancer is the most common malignancy in women worldwide. It is also the second leading cause of cancer death among women after lung cancer. Considering the relationship among plasma folate levels, the level of uracil, and DNA damage in cell division, methyl tetrahydrofolate reductase (MTHFR) is a suitable candidate for studies on the susceptibility to cancer, including brea...
Study on the association of Epstein-Barr virus with breast cancer in Khorramabad breast cancer patients, Iran
Background: Breast cancer is one of the most common malignancies in the world, and early diagnosis of this cancer is a key factor in its treatment. This cancer is a multi-stage disease, in which viruses can play a role. EBV is known as an important factor in the development of some human cancers. Therefore, this study was conducted to determine the relationship between Epstein-Barr virus, EBV, ...
Journal title:
- Statistical applications in genetics and molecular biology
Volume 7, Issue 2
Pages: -
Publication date: 2008